-
Notifications
You must be signed in to change notification settings - Fork 6.8k
fix memory corruption in gluoncv ssd training #13727
Conversation
LGTM, thanks @arcadiaphy |
@@ -785,7 +785,7 @@ void BipartiteMatchingForward(const nnvm::NodeAttrs& attrs, | |||
.get_with_shape<xpu, 2, DType>(Shape2(batch_size, col), s); | |||
Shape<1> sort_index_shape = Shape1(dshape.Size()); | |||
index_t workspace_size = sort_index_shape.Size(); | |||
workspace_size += ((sort_index_shape.Size() * sizeof(int32_t) - 1) / sizeof(DType)) * 2; | |||
workspace_size += (sort_index_shape.Size() * 2 * sizeof(int32_t) - 1) / sizeof(DType) + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for my naive understanding. But, I do not understand why +1? And this mean, we will have one extra size in the workspace always?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we will have extra size sometimes when we need to hold int32_t data in DType workspace, and the calculation assures that only the smallest needed space is allocated.
if workspace = (N * sizeof(int32_t) - 1) / sizeof(DType) + 1,
then workspace * sizeof(DType) >= N * sizeof(int32_t) - 1 - (sizeof(DType) - 1) + sizeof(DType)
= N * sizeof(int32_t)
Actually, L426 has already calculated correctly.
Is it better to create two workspaces for DType and int32_t respectively? |
workspace allocation is limited to one time only, so total space has to be calculated aforehead |
@zhreshold Thanks! |
@zhreshold Is this PR good to go? |
Description
Fix #13710
In bipartite match, the temporary workspace size is miscalculated and not enough to hold the index tensor.
Since it is related to
sizeof(DType)
andsizeof(int32_t)
, the bug is only triggered on some platforms.Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments